Thompson Sampling: Endogenously Random Behavior in Games and Markets

Similar resources

Learning in Games by Random Sampling

We study repeated interactions among a fixed set of “low rationality” players. Each player has a status quo action. Occasionally, he randomly samples other actions and changes his status quo if the sampled action yields a higher payoff. This behavior generates a random process, the better-reply dynamics. *We thank seminar participants at Boston and Northwestern Universities for stimulating comm...

Endogenously Incomplete Markets: Macroeconomic Implications

Endogenously incomplete models derive restrictions on asset trading from primitive constraints on the enforcement and monitoring technologies available to societies. They have been applied to a wide variety of macroeconomic problems. This essay reviews some of these applications and the models that underpin them.

Freshness-Aware Thompson Sampling

To keep up with the dynamic nature of the user’s content, researchers have recently started to model interactions between users and Context-Aware Recommender Systems (CARS) as a bandit problem in which the system must deal with the exploration-exploitation dilemma. In this sense, we propose to study the freshness of the user’s content in CARS through the bandit problem. We introduce in this paper an ...

Spectral Thompson Sampling

Thompson Sampling (TS) has attracted a lot of interest due to its good empirical performance, in particular in computational advertising. Though successful, the tools for its performance analysis appeared only recently. In this paper, we describe and analyze the SpectralTS algorithm for a bandit problem, where the payoffs of the choices are smooth given an underlying graph. In this setting, each c...

Linear Thompson Sampling Revisited

We derive an alternative proof for the regret of Thompson sampling (TS) in the stochastic linear bandit setting. While we obtain a regret bound of order $\tilde{O}(d^{3/2}\sqrt{T})$ as in previous results, the proof sheds new light on the functioning of TS. We leverage the structure of the problem to show how the regret is related to the sensitivity (i.e., the gradient) of the objective function and h...

Journal

Journal title: SSRN Electronic Journal

Year: 2017

ISSN: 1556-5068

DOI: 10.2139/ssrn.3061481